14 research outputs found

Speaker-Independent Mel-cepstrum Estimation from Articulator Movements Using D-vector Input

Author: Katsurada Kouichi
Richmond Korin
Publication venue: 'International Speech Communication Association'
Publication date: 25/10/2020
Crossref

Edinburgh Research Explorer

New Grapheme Generation Rules for Two-Stage Modelbased Grapheme-to-Phoneme Conversion

Author: Iribe Yurie
Katsurada Kouichi
Kheang Seng
Nitta Tsuneo
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 20/12/2014
The precise conversion of arbitrary text into its corresponding phoneme sequence (grapheme-to-phoneme or G2P conversion) is implemented in speech synthesis and recognition, pronunciation learning software, spoken term detection and spoken document retrieval systems. Because the quality of this module plays an important role in the performance of such systems and many problems regarding G2P conversion have been reported, we propose a novel two-stage model-based approach, which is implemented using an existing weighted finite-state transducer-based G2P conversion framework, to improve the performance of the G2P conversion model. The first-stage model is built for automatic conversion of words to phonemes, while the second-stage model utilizes the input graphemes and output phonemes obtained from the first stage to determine the best final output phoneme sequence. Additionally, we designed new grapheme generation rules, which enable extra detail for the vowel and consonant graphemes appearing within a word. When compared with previous approaches, the evaluation results indicate that our approach using rules focusing on the vowel graphemes slightly improved the accuracy of the out-of-vocabulary dataset and consistently increased the accuracy of the in-vocabulary dataset.

Journal of ICT Research and Applications

Directory of Open Access Journals

ITB Journal

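Articulatory-to-Acoustic Conversion from rtMRI Data Using a Transposed Convolutional Neural Network (転置畳み込みニューラルネットワークを用いたrtMRIデータからの調音-音響変換)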

Author: Hidefumi OHMURA (大村英史)
Kouichi KATSURADA (桂田浩一)
Ryo TANJI (丹治涼)
Shun SAWADA (澤田隼)
Publication venue: National Institute for Japanese Language and Linguistics (国立国語研究所)
Publication date: 01/01/2021
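Author affiliation: Tokyo University of Science (all four authors)

Conference: Language Resources Workshop 2021 (言語資源活用ワークショップ2021). Venue: online. Dates: 13-14 September 2021. Organizer: Center for Corpus Development, National Institute for Japanese Language and Linguistics.

This paper proposes a deep learning model for generating acoustic features from rtMRI data. Because rtMRI can record the entire set of articulators at high resolution, it is considered useful as source data for generating acoustic features from articulatory data, but it has a relatively low frame rate. We therefore propose a method that performs super-resolution along the time axis using a transposed convolutional network. Whereas a standard convolutional neural network mainly compresses neighboring information in an image through convolution, a transposed convolutional network performs the reverse operation and thereby increases image resolution. In this method, this super-resolution is applied along the time direction of the rtMRI data to improve its temporal resolution. Experiments using mel-cepstral distortion and PESQ as evaluation measures showed that the transposed convolutional network is effective for generating accurate acoustic features. We also confirmed that increasing the super-resolution factor improves the PESQ score.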
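As an illustration of the two technical ideas named in this abstract, the following is a minimal NumPy sketch, not the authors' model. It shows (1) a one-dimensional transposed convolution applied along the time axis of a feature sequence, which is the basic operation behind temporal super-resolution, and (2) a commonly used definition of mel-cepstral distortion. The function names, the fixed (unlearned) kernel, and the choice to exclude the 0th cepstral coefficient are assumptions made for this example; the paper's actual architecture and metric settings are not given in the record.

```python
import numpy as np


def transposed_conv1d_time(frames, kernel, stride):
    """Upsample a (T, D) sequence along time with a 1-D transposed convolution.

    Each input frame t is scattered into output rows t*stride .. t*stride+K-1,
    weighted by the kernel; overlapping contributions are summed.
    In a trained network the kernel would be learned (assumption: fixed here).
    """
    frames = np.asarray(frames, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    num_frames, dim = frames.shape
    k = kernel.shape[0]
    out = np.zeros(((num_frames - 1) * stride + k, dim))
    for t in range(num_frames):
        start = t * stride
        out[start:start + k] += kernel[:, None] * frames[t][None, :]
    return out


def mel_cepstral_distortion(ref, est):
    """Mean mel-cepstral distortion in dB between two (T, D) mel-cepstrum arrays.

    Uses the common form (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2),
    averaged over frames, skipping the 0th (energy) coefficient (assumption).
    """
    ref = np.asarray(ref, dtype=float)
    est = np.asarray(est, dtype=float)
    diff = ref[:, 1:] - est[:, 1:]
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(per_frame))


if __name__ == "__main__":
    # Three low-frame-rate frames, one feature each; stride 2 doubles the rate.
    low_rate = [[1.0], [2.0], [3.0]]
    high_rate = transposed_conv1d_time(low_rate, kernel=[1.0, 1.0], stride=2)
    print(high_rate[:, 0])
```

With the kernel `[1.0, 1.0]` and stride 2, each input frame is simply repeated twice, so the operation reduces to nearest-neighbor upsampling in time; a learned kernel would instead interpolate between frames in a data-driven way.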

Academic Repository of the National Institute for Japanese Language and Linguistics / 国立国語研究所学術情報リポジトリ

Using Reversed Sequences and Grapheme Generation Rules to Extend the Feasibility of a Phoneme Transition Network-Based Grapheme-to-Phoneme Conversion

Author: Kouichi KATSURADA
Seng KHEANG
Tsuneo NITTA
Yurie IRIBE
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'
Publication date: 01/01/2016
Crossref

A Model of Belief Formation Based on Causality and Application to N-armed Bandit Problem

Author: Kouichi Katsurada
Ryo Taguchi
Shuji Shinohara
Tsuneo Nitta
Publication venue: 'Japanese Society for Artificial Intelligence'
Publication date: 01/01/2007
Crossref

Development of a Toolkit for Spoken Dialog Systems with an Anthropomorphic Agent: Galatea

Author: Katsurada Kouichi
Kawahara Tatsuya
Lee Akinobu
Morishima Shigeo
Nishimoto Takuya
Nitta Tsuneo
Yamashita Yoichi
Yotsukura Tatsuo
Publication venue: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference, International Organizing Committee
Publication date: 04/10/2009
The Interactive Speech Technology Consortium (ISTC) has been developing a toolkit called Galatea that comprises four fundamental modules for speech recognition, speech synthesis, face synthesis, and dialog control, that can be used to realize an interface for spoken dialog systems with an anthropomorphic agent. This paper describes the development of the Galatea toolkit and the functions of each module; in addition, it discusses the standardization of the description of multi-modal interactions.

APSIPA ASC 2009: Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference. 4-7 October 2009. Sapporo, Japan. Oral session: Infrastructure Software for Speech Processing (5 October 2009)

Hokkaido University Collection of Scholarly and Academic Papers